Goto

Collaborating Authors

 backward elimination


Automated Model Selection for Tabular Data

arXiv.org Artificial Intelligence

Structured data in the form of tabular datasets contain features that are distinct and discrete, with varying individual and relative importances to the target. Combinations of one or more features may be more predictive and meaningful than simple individual feature contributions. R's mixed effect linear models library allows users to provide such interactive feature combinations in the model design. However, given many features and possible interactions to select from, model selection becomes an exponentially difficult task. We aim to automate the model selection process for predictions on tabular datasets incorporating feature interactions while keeping computational costs small. The framework includes two distinct approaches for feature selection: a Priority-based Random Grid Search and a Greedy Search method. The Priority-based approach efficiently explores feature combinations using prior probabilities to guide the search. The Greedy method builds the solution iteratively by adding or removing features based on their impact. Experiments on synthetic demonstrate the ability to effectively capture predictive feature combinations.


Outlier Explanation via Sum-Product Networks

arXiv.org Artificial Intelligence

Outlier explanation is the task of identifying a set of features that distinguish a sample from normal data, which is important for downstream (human) decision-making. Existing methods are based on beam search in the space of feature subsets. They quickly becomes computationally expensive, as they require to run an outlier detection algorithm from scratch for each feature subset. To alleviate this problem, we propose a novel outlier explanation algorithm based on Sum-Product Networks (SPNs), a class of probabilistic circuits. Our approach leverages the tractability of marginal inference in SPNs to compute outlier scores in feature subsets. By using SPNs, it becomes feasible to perform backwards elimination instead of the usual forward beam search, which is less susceptible to missing relevant features in an explanation, especially when the number of features is large. We empirically show that our approach achieves state-of-the-art results for outlier explanation, outperforming recent search-based as well as deep learning-based explanation methods


Feature Selection using Wrapper Method with Python Implementation

#artificialintelligence

In today's era of Big data and IoT, we are easily loaded with rich datasets having extremely high dimensions. In order to perform any machine learning task or to get insights from such high dimensional data, feature selection becomes very important. Increase in complexity of a model and makes it harder to interpret. Increase in time complexity for a model to get trained. Hence, it gives an indispensable need to perform feature selection.


Feature selection using Wrapper methods in Python

#artificialintelligence

In today's era of Big data and IoT, we are easily loaded with rich datasets having extremely high dimensions. In order to perform any machine learning task or to get insights from such high dimensional data, feature selection becomes very important. Hence, it gives an indispensable need to perform feature selection. Feature selection is very crucial and must component in machine learning and data science workflows especially while dealing with high dimensional datasets. As the name suggests, it is a process of selecting the most significant and relevant features from a vast set of features in the given dataset.


Feature Selection with sklearn and Pandas

#artificialintelligence

Feature selection is one of the first and important steps while performing any machine learning task. A feature in case of a dataset simply means a column. When we get any dataset, not necessarily every column (feature) is going to have an impact on the output variable. If we add these irrelevant features in the model, it will just make the model worst (Garbage In Garbage Out). This gives rise to the need of doing feature selection.


Evolutionary Algorithms for Feature Selection

@machinelearnbot

Feature selection is a very important technique in machine learning.We need to be able to solve it to produce models. Feature Selection requires heuristic processes to find an optimal machine learning subset. In the previous post we discussed the brute force algorithm as well as forward selection and backward elimination which were both not a great fit. What other options are there? We can use one of the most common optimization algorithms for multi-modal fitness landscapes: evolutionary algorithms. Evolutionary algorithm is a generic optimization technique mimicking the ideas of natural evolution.


A Review of Feature Selection Methods Based on Mutual Information

arXiv.org Machine Learning

In this work we present a review of the state of the art of information theoretic feature selection methods. The concepts of feature relevance, redundance and complementarity (synergy) are clearly defined, as well as Markov blanket. The problem of optimal feature selection is defined. A unifying theoretical framework is described, which can retrofit successful heuristic criteria, indicating the approximations made by each method. A number of open problems in the field are presented.


A note relating ridge regression and OLS p-values to preconditioned sparse penalized regression

arXiv.org Machine Learning

When the design matrix has orthonormal columns, "soft thresholding" the ordinary least squares (OLS) solution produces the Lasso solution [Tibshirani, 1996]. If one uses the Puffer preconditioned Lasso [Jia and Rohe, 2012], then this result generalizes from orthonormal designs to full rank designs (Theorem 1). Theorem 2 refines the Puffer preconditioner to make the Lasso select the same model as removing the elements of the OLS solution with the largest p-values. Using a generalized Puffer preconditioner, Theorem 3 relates ridge regression to the preconditioned Lasso; this result is for the high dimensional setting, p > n. Where the standard Lasso is akin to forward selection [Efron et al., 2004], Theorems 1, 2, and 3 suggest that the preconditioned Lasso is more akin to backward elimination. These results hold for sparse penalties beyond l1; for a broad class of sparse and non-convex techniques (e.g. SCAD and MC+), the results hold for all local minima.


Markov Blanket Ranking using Kernel-based Conditional Dependence Measures

arXiv.org Machine Learning

Developing feature selection algorithms that move beyond a pure correlational to a more causal analysis of observational data is an important problem in the sciences. Several algorithms attempt to do so by discovering the Markov blanket of a target, but they all contain a forward selection step which variables must pass in order to be included in the conditioning set. As a result, these algorithms may not consider all possible conditional multivariate combinations. We improve on this limitation by proposing a backward elimination method that uses a kernel-based conditional dependence measure to identify the Markov blanket in a fully multivariate fashion. The algorithm is easy to implement and compares favorably to other methods on synthetic and real datasets.